For this project, we will work with a subset of the NYC restaurant inspections dataset accessed from https://dev.socrata.com/foundry/data.cityofnewyork.us/9w7m-hzhe in October 2017.
We will limit the data to restaurants with an inspection grade of “A,” “B,” or “C” and a known borough.
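library(tidyverse)
library(plotly)

# Read the inspection results, keeping building numbers as text
restaurant_inspections = read_csv("DOHMH_New_York_City_Restaurant_Inspection_Results.csv.gz",
                                  col_types = cols(building = col_character()),
                                  na = c("NA", "N/A")) %>%
  # Keep graded inspections with a known borough
  filter(grade %in% c("A", "B", "C"), boro != "Missing") %>%
  mutate(boro = str_to_title(boro))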
Below we present boxplots of the distribution of inspection scores by borough. Across all boroughs, the scores are substantially right-skewed.
restaurant_inspections %>%
  group_by(boro) %>%
  plot_ly(y = ~score, color = ~boro, type = "box")
## Warning: Ignoring 3 observations
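One rough numerical check on the skew seen in the boxplots is to compare the mean and median score within each borough: a mean well above the median is consistent with a long right tail. The sketch below uses a small made-up data frame (`toy_scores`, hypothetical values, not real inspection data) so that it runs on its own. The same pipeline could be applied to `restaurant_inspections`, where `na.rm = TRUE` would skip the missing scores that the boxplot warning refers to.

```r
library(dplyr)

# Hypothetical toy scores, made up for illustration only
toy_scores <- data.frame(
  boro  = rep(c("Bronx", "Queens"), each = 6),
  score = c(2, 5, 7, 9, 12, 40, 3, 4, 8, 10, 13, 52)
)

skew_summary <- toy_scores %>%
  group_by(boro) %>%
  summarize(
    n_scores          = sum(!is.na(score)),          # non-missing scores per borough
    median_score      = median(score, na.rm = TRUE),
    mean_score        = mean(score, na.rm = TRUE),
    mean_minus_median = mean_score - median_score,   # positive values suggest right skew
    .groups = "drop"
  )

skew_summary
```

This is only a heuristic. A positive gap between mean and median suggests right skew but does not prove it, so it is best read alongside the boxplots rather than in place of them.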
Below we present a plot showing the frequency of “A,” “B,” and “C” grades received in each borough.
restaurant_inspections %>%
  group_by(boro) %>%
  count(grade) %>%
  ungroup() %>%
  mutate(boro = fct_reorder(boro, n)) %>%
  plot_ly(x = ~boro, y = ~n, color = ~grade, type = "bar")
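Because the bar chart shows raw counts, boroughs with more graded inspections will have taller bars whatever their grade mix. One way to compare the mix itself is to convert counts to within-borough proportions. The sketch below does this on a small made-up data frame (`toy_grades`, hypothetical values) so that it runs on its own. The `prop` column is a name introduced here for illustration.

```r
library(dplyr)

# Hypothetical toy grades, made up for illustration only
toy_grades <- data.frame(
  boro  = c("Bronx", "Bronx", "Bronx", "Bronx",
            "Queens", "Queens", "Queens", "Queens", "Queens"),
  grade = c("A", "A", "B", "C",
            "A", "A", "A", "B", "B")
)

grade_shares <- toy_grades %>%
  count(boro, grade) %>%            # one row per borough and grade, with count n
  group_by(boro) %>%
  mutate(prop = n / sum(n)) %>%     # share of each grade within its borough
  ungroup()

grade_shares
```

Passing `y = ~prop` instead of `y = ~n` to the same `plot_ly()` call would then plot shares rather than counts.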
Below we present a time series plot of inspection scores, with one line per restaurant. Scores fluctuate substantially over time, and a restaurant's baseline inspection score does not appear to predict its later scores.
restaurant_inspections %>%
  # plot_ly() respects dplyr groups, so grouping by camis gives one line per restaurant
  group_by(camis) %>%
  # Keep one row per restaurant and inspection date
  filter(!duplicated(inspection_date)) %>%
  select(camis, inspection_date, score) %>%
  plot_ly(x = ~inspection_date, y = ~score,
          type = 'scatter',
          alpha = 0.25,
          mode = 'lines+markers') %>%
  layout(xaxis = list(range = c(as.Date('2012-01-01'), as.Date('2019-01-01'))))
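The observation that baseline score does not appear to predict later scores is based on the plot alone. A rough way to check it numerically is to pair each restaurant's first score with the mean of its later scores and look at their rank correlation. The sketch below assumes one row per restaurant and inspection date (as after the de-duplication step above) and uses a small made-up data frame (`toy_inspections`, hypothetical values). The names `baseline_score` and `later_mean` are introduced here for illustration.

```r
library(dplyr)

# Hypothetical toy inspections, made up for illustration only
toy_inspections <- data.frame(
  camis = rep(c(1, 2, 3, 4), each = 3),
  inspection_date = as.Date(rep(c("2015-01-10", "2016-02-15", "2017-03-20"), times = 4)),
  score = c(5, 12, 9,  20, 7, 11,  10, 10, 13,  2, 18, 6)
)

baseline_vs_later <- toy_inspections %>%
  arrange(camis, inspection_date) %>%    # earliest inspection first within each restaurant
  group_by(camis) %>%
  filter(n() >= 2) %>%                   # need at least one follow-up inspection
  summarize(
    baseline_score = first(score),
    later_mean     = mean(score[-1], na.rm = TRUE),  # mean of all scores after the first
    .groups = "drop"
  )

# Spearman correlation between baseline and later scores
cor(baseline_vs_later$baseline_score, baseline_vs_later$later_mean,
    method = "spearman", use = "complete.obs")
```

A correlation near zero on the real data would be consistent with the visual impression. A clearly positive value would suggest that the overlapping lines hide some persistence in scores.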